An Efficient Indoor Navigation Technique To Find Optimal Route For Blinds Using QR Codes

Idrees, Affan, Iqbal, Zahid, Ishfaq, Maria

arXiv.org Artificial Intelligence

Blind navigation is an accessibility application that enables blind people to use an Android smartphone easily for indoor navigation, with instructions delivered in audio form. We propose a prototype indoor navigation application for blind people that uses QR codes. It is developed for Android smartphones and requires no additional hardware for navigation. It provides automatic navigational assistance along pre-defined paths. QR codes are placed on floor sections at fixed intervals and serve as input for current-location detection and navigation. Whenever a QR code is scanned, the application announces the current location, asks the user to select a destination, and then computes the optimal, shortest path using path-finding algorithms. During navigation, whenever a deviation from the proposed path is detected, the application alerts the user and guides them back to the correct path by comparing the current path with the generated one. All instructions throughout the application are provided to the user in audio form. The interface is designed specifically for blind users, making the smartphone user-friendly and usable for them. The user interacts with the application through a specific set of user-friendly gestures for specific inputs and operations. Finally, we compare different state-of-the-art approaches and conclude that our approach is more user-friendly and cost-effective and produces more accurate results.
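The abstract does not name the path-finding algorithm, so here is a minimal sketch of one plausible realization in Python: the floor plan modeled as a weighted graph whose nodes are QR-code locations, Dijkstra's algorithm for the shortest route, and a deviation check that compares a freshly scanned code against the planned route. The node names, distances, and function names are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch: shortest-path routing over QR-code waypoints (assumed data model).
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts adjacency map."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical floor plan: QR codes as nodes, distances in meters as weights.
floor_plan = {
    "entrance": {"hall_1": 5},
    "hall_1": {"entrance": 5, "hall_2": 4, "room_101": 3},
    "hall_2": {"hall_1": 4, "room_102": 2},
    "room_101": {"hall_1": 3},
    "room_102": {"hall_2": 2},
}

cost, route = shortest_path(floor_plan, "entrance", "room_102")
# cost == 11, route == ["entrance", "hall_1", "hall_2", "room_102"]

def check_deviation(scanned_qr, planned_route):
    """Flag a deviation when a scanned QR code is not on the planned route."""
    return scanned_qr not in planned_route
```

On a deviation, the application would re-run `shortest_path` from the scanned node to the destination and resume audio guidance from there.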


Beyond Omakase: Designing Shared Control for Navigation Robots with Blind People

Kamikubo, Rie, Kayukawa, Seita, Kaniwa, Yuka, Wang, Allan, Kacorri, Hernisa, Takagi, Hironobu, Asakawa, Chieko

arXiv.org Artificial Intelligence

Autonomous navigation robots can increase the independence of blind people but often limit user control, following what is called in Japanese an "omakase" approach where decisions are left to the robot. This research investigates ways to enhance user control in social robot navigation, based on two studies conducted with blind participants. The first study, involving structured interviews (N=14), identified crowded spaces as key areas with significant social challenges. The second study (N=13) explored navigation tasks with an autonomous robot in these environments and identified design strategies across different modes of autonomy. Participants preferred an active role, termed the "boss" mode, where they managed crowd interactions, while the "monitor" mode helped them assess the environment, negotiate movements, and interact with the robot. These findings highlight the importance of shared control and user involvement for blind users, offering valuable insights for designing future social navigation robots.


EgoBlind: Towards Egocentric Visual Assistance for the Blind People

Xiao, Junbin, Huang, Nanxin, Qiu, Hao, Tao, Zhulin, Yang, Xun, Hong, Richang, Wang, Meng, Yao, Angela

arXiv.org Artificial Intelligence

We present EgoBlind, the first egocentric VideoQA dataset collected from blind individuals to evaluate the assistive capabilities of contemporary multimodal large language models (MLLMs). EgoBlind comprises 1,210 videos that record the daily lives of real blind users from a first-person perspective. It also features 4,927 questions directly posed, or generated and verified, by blind individuals to reflect their needs for visual assistance across various scenarios. We provide an average of 3 reference answers per question to mitigate subjectivity in evaluation. Using EgoBlind, we comprehensively evaluate 15 leading MLLMs and find that all models struggle, with the best performers achieving accuracy around 56%, far behind human performance of 87.4%. To guide future advancements, we identify and summarize major limitations of existing MLLMs in egocentric visual assistance for the blind and provide heuristic suggestions for improvement. With these efforts, we hope EgoBlind can serve as a valuable foundation for developing more effective AI assistants that enhance the independence of blind individuals.
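As a rough illustration of how multiple reference answers can reduce subjectivity, the sketch below scores a model's answer against each reference and keeps the best match. Token-overlap F1 is a stand-in metric chosen for the example; the paper's actual evaluation protocol may differ.

```python
# Sketch: best-over-references scoring with a simple token-overlap F1.
def token_f1(prediction: str, reference: str) -> float:
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if not pred or not ref or not common:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

def score(prediction: str, references: list) -> float:
    """Best score over all references (typically ~3 per question)."""
    return max(token_f1(prediction, r) for r in references)

# Any acceptable phrasing gets credit via the max over references.
print(score("the door is on your left",
            ["door on the left", "it is to your left", "left side door"]))
```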


A Neuralink Rival Says Its Eye Implant Restored Vision in Blind People

WIRED

For years, they had been losing their central vision, which allows people to see letters, faces, and details clearly. The light-receiving cells in their eyes were deteriorating, gradually blurring their sight. But after receiving an experimental eye implant as part of a clinical trial, some study participants can now see well enough to read from a book, play cards, and fill in a crossword puzzle despite being legally blind. Science Corporation, the California-based brain-computer interface company developing the implant, announced the preliminary results this week. When Max Hodak, Science's CEO and former president of Neuralink, first saw a video of a blind patient reading while using the implant, he was stunned.


ProgramAlly: Creating Custom Visual Access Programs via Multi-Modal End-User Programming

Herskovitz, Jaylin, Xu, Andi, Alharbi, Rahaf, Guo, Anhong

arXiv.org Artificial Intelligence

Existing visual assistive technologies are built for simple and common use cases, and have few avenues for blind people to customize their functionalities. Drawing from prior work on DIY assistive technology, this paper investigates end-user programming as a means for users to create and customize visual access programs to meet their unique needs. We introduce ProgramAlly, a system for creating custom filters for visual information, e.g., 'find NUMBER on BUS', leveraging three end-user programming approaches: block programming, natural language, and programming by example. To implement ProgramAlly, we designed a representation of visual filtering tasks based on scenarios encountered by blind people, and integrated a set of on-device and cloud models for generating and running these programs. In user studies with 12 blind adults, we found that participants preferred different programming modalities depending on the task, and envisioned using visual access programs to address unique accessibility challenges that are otherwise difficult with existing applications. Through ProgramAlly, we present an exploration of how blind end-users can create visual access programs to customize and control their experiences.
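To make the filter-program idea concrete, here is a minimal sketch of how a query like 'find NUMBER on BUS' might be represented as composable filter steps over a frame's detections. The Detection type, step names, and pipeline are illustrative assumptions, not ProgramAlly's actual representation.

```python
# Sketch: a visual access program as a chain of filter steps (assumed model).
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Detection:
    label: str   # object class or recognized text
    box: tuple   # (x, y, w, h) in image coordinates

FilterStep = Callable[[List[Detection]], List[Detection]]

def find(label: str) -> FilterStep:
    """Keep only detections of a target class, e.g. 'bus'."""
    return lambda dets: [d for d in dets if d.label == label]

def text_inside(regions: List[Detection]) -> FilterStep:
    """Keep detections whose boxes fall inside any region box."""
    def contains(outer, inner):
        ox, oy, ow, oh = outer
        ix, iy, iw, ih = inner
        return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh
    return lambda dets: [d for d in dets
                         if d not in regions
                         and any(contains(r.box, d.box) for r in regions)]

# "find NUMBER on BUS" as two composed steps over one frame's detections:
frame = [
    Detection("bus", (10, 20, 200, 120)),
    Detection("42", (60, 40, 30, 20)),     # route number painted on the bus
    Detection("STOP", (300, 50, 40, 20)),  # unrelated text elsewhere
]
buses = find("bus")(frame)
numbers = text_inside(buses)(frame)
# -> [Detection(label='42', ...)], announced to the user via text-to-speech
```

Each end-user programming modality (blocks, natural language, or examples) would then be a different front-end for assembling the same step chain.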


AccessShare: Co-designing Data Access and Sharing with Blind People

Kamikubo, Rie, Zeraati, Farnaz Zamiri, Lee, Kyungjun, Kacorri, Hernisa

arXiv.org Artificial Intelligence

Blind people are often called to contribute image data to datasets for AI innovation with the hope for future accessibility and inclusion. Yet, the visual inspection of the contributed images is inaccessible. To this day, we lack mechanisms for data inspection and control that are accessible to the blind community. To address this gap, we engage 10 blind participants in a scenario where they wear smartglasses and collect image data using an AI-infused application in their homes. We also engineer a design probe, a novel data access interface called AccessShare, and conduct a co-design study to discuss participants' needs, preferences, and ideas on consent, data inspection, and control. Our findings reveal the impact of interactive informed consent and the complementary role of data inspection systems such as AccessShare in facilitating communication between data stewards and blind data contributors. We discuss how key insights can guide future informed consent and data control to promote inclusive and responsible data practices in AI.


Memory-Maze: Scenario Driven Benchmark and Visual Language Navigation Model for Guiding Blind People

Kuribayashi, Masaki, Uehara, Kohei, Wang, Allan, Sato, Daisuke, Chu, Simon, Morishima, Shigeo

arXiv.org Artificial Intelligence

Visual Language Navigation (VLN) powered navigation robots have the potential to guide blind people by understanding and executing route instructions provided by sighted passersby. This capability allows robots to operate in environments that are often unknown a priori. Existing VLN models are insufficient for the scenario of navigation guidance for blind people, as they must understand routes described from human memory, which frequently contain stutters, errors, and omitted details, as opposed to instructions obtained by thinking aloud, such as those in the Room-to-Room dataset. However, no existing benchmark simulates instructions obtained from human memory in environments where blind people navigate. To this end, we present our benchmark, Memory-Maze, which simulates the scenario of seeking route instructions for guiding blind people. Our benchmark contains a maze-like structured virtual environment and novel route-instruction data from human memory. To collect natural-language instructions, we conducted two studies: one with sighted passersby onsite and one with annotators online. Our analysis demonstrates that instructions collected onsite were lengthier and contained more varied wording. Alongside our benchmark, we propose a VLN model better equipped to handle this scenario. Our proposed VLN model uses Large Language Models (LLMs) to parse instructions and generate Python code for robot control. We further show that the existing state-of-the-art model performed suboptimally on our benchmark, whereas our proposed method outperformed it by a fair margin. We found that future research should exercise caution when considering VLN technology for practical applications, as real-world scenarios have different characteristics than those captured in traditional settings.
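A minimal sketch of the instruction-to-code idea follows: an LLM is prompted to translate a remembered route description into calls against a small robot-control API. The API names, prompt, and `llm` callable are assumptions for illustration, not the paper's implementation.

```python
# Sketch: LLM parses a remembered route instruction into robot-control code.
ROBOT_API_DOC = """
Available calls:
  move_forward(meters)   # drive straight
  turn(degrees)          # positive = left, negative = right
  stop()
"""

def instruction_to_code(llm, instruction: str) -> str:
    """Ask an LLM to translate a route instruction into Python calls."""
    prompt = (
        "Translate this route instruction into Python using only the API "
        "below. The instruction may contain stutters, errors, or omissions; "
        f"infer conservatively.\n{ROBOT_API_DOC}\nInstruction: {instruction}\n"
    )
    return llm(prompt)
    # Expected output for "go, uh, straight past the vending machines,
    # then right, it's maybe five meters" might be:
    #   move_forward(10)
    #   turn(-90)        # right turn at the vending machines
    #   move_forward(5)
    #   stop()
```

The generated code would be validated against the API before execution, since memory-based instructions are exactly where the model is most likely to misparse.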


"We are at the mercy of others' opinion": Supporting Blind People in Recreational Window Shopping with AI-infused Technology

Kamikubo, Rie, Kacorri, Hernisa, Asakawa, Chieko

arXiv.org Artificial Intelligence

Engaging in recreational activities in public spaces poses challenges for blind people, often involving dependency on sighted help. Window shopping is a key recreational activity that remains inaccessible. In this paper, we investigate the information needs, challenges, and current approaches blind people have for recreational window shopping, to inform the design of existing wayfinding and navigation technology for supporting blind shoppers in exploration and serendipitous discovery. We conduct a formative study with a total of 18 blind participants, comprising focus groups (N=8) and requirements-analysis interviews (N=10). We find a desire for push notifications of promotional information and pull notifications about shops of interest, such as a brand's target audience. Information about obstacles and points of interest required customization depending on one's mobility aid as well as on the presence of crowds, children, and wheelchair users. We translate these findings into specific information modalities and renderings in the context of two existing AI-infused assistive applications: NavCog (a turn-by-turn navigation app) and Cabot (a navigation robot).


VIAssist: Adapting Multi-modal Large Language Models for Users with Visual Impairments

Yang, Bufang, He, Lixing, Liu, Kaiwei, Yan, Zhenyu

arXiv.org Artificial Intelligence

Individuals with visual impairments, encompassing both partial and total difficulty in visual perception, are referred to as visually impaired (VI) people. An estimated 2.2 billion individuals worldwide are affected by visual impairments. Recent advances in multi-modal large language models (MLLMs) have showcased extraordinary capabilities across various domains, and it is desirable to bring their visual understanding and reasoning to bear for VI individuals. However, it is challenging for VI people to use MLLMs because of the difficulty of capturing images suitable for their daily requests; for example, the target object may be only partially captured, or missing from the image entirely. This paper explores how to leverage MLLMs to provide visual question answering for VI individuals. We present VIAssist, which can identify undesired images and provide detailed corrective actions, and then deliver reliable answers to users' queries based on the images. Our results show that VIAssist achieves +0.21 and +0.31 higher BERTScore and ROUGE scores than the baseline, respectively.
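As a sketch of this two-stage flow, the snippet below first asks an MLLM whether the captured image can answer the question, suggests a corrective camera action if not, and otherwise answers. The prompts and the `mllm` callable are illustrative assumptions, not VIAssist's actual pipeline.

```python
# Sketch: judge image usability first, then either coach a retake or answer.
def assist(mllm, image, question: str) -> str:
    verdict = mllm(image, f"Can this image answer: '{question}'? "
                          "Reply USABLE, or state what is missing.")
    if not verdict.startswith("USABLE"):
        # e.g., "The object is cut off on the left; move the camera left."
        return mllm(image, "Describe one concrete camera adjustment the "
                           "user should make, in a single short sentence.")
    return mllm(image, question)
```

Separating the usability check from the answer keeps the model from confidently answering questions about objects that never made it into the frame.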


Improve accessibility for Low Vision and Blind people using Machine Learning and Computer Vision

Shukurov, Jasur

arXiv.org Artificial Intelligence

With the ever-growing expansion of mobile technology worldwide, there is an increasing need for accommodations for people with disabilities. This project explores how machine learning and computer vision can be used to improve accessibility for people with visual impairments. There have been many attempts to develop software that improves accessibility in the day-to-day lives of blind people; however, applications on the market have low accuracy and provide only audio feedback. This project concentrates on building a mobile application that helps blind people orient themselves in space by receiving real-time audio and haptic feedback, e.g., vibrations, about their surroundings. The mobile application has three main features. The first is scanning text with the camera and reading it aloud to the user; this feature can be used on printed documents, text in the environment, and road signs. The second is detecting objects around the user and providing audio feedback about them, including descriptions of the objects and their locations, with haptic feedback if the user gets too close to an object. The last feature is currency detection, which announces the total value of the currency shown to the camera.
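As an illustration of the proximity-alert feature, the sketch below estimates an object's distance from a detector's bounding-box width using a pinhole-camera approximation and triggers vibration when the object is too close. The known widths, focal length, threshold, and the vibrate/announce hooks are all assumptions for the example, not the project's actual parameters.

```python
# Sketch: distance estimation and haptic alerts from object detections.
KNOWN_WIDTH_M = {"person": 0.5, "chair": 0.45, "door": 0.9}
FOCAL_LENGTH_PX = 800          # assumed camera focal length in pixels
ALERT_DISTANCE_M = 1.5         # assumed "too close" threshold

def estimate_distance(label: str, box_width_px: float) -> float:
    """Pinhole model: distance = real_width * focal_length / pixel_width."""
    real_width = KNOWN_WIDTH_M.get(label, 0.5)
    return real_width * FOCAL_LENGTH_PX / box_width_px

def proximity_feedback(detections, vibrate, announce):
    """detections: (label, box_width_px) pairs from the object detector."""
    for label, width_px in detections:
        distance = estimate_distance(label, width_px)
        announce(f"{label} about {distance:.1f} meters ahead")
        if distance < ALERT_DISTANCE_M:
            vibrate()  # haptic warning when an obstacle is too close

# Example: a chair 400 px wide is ~0.9 m away, so this vibrates.
proximity_feedback([("chair", 400)], vibrate=lambda: print("buzz"),
                   announce=print)
```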